INSTITUTO POLITÉCNICO NACIONAL CENTRO DE INVESTIGACIÓN EN COMPUTACIÓN Laboratorio de Procesamiento de Lenguaje Natural Polarity summarization with opinion mining

Authors

  • Iván Omar Cruz
  • Grigori Sidorov
  • Alexander Gelbukh
Abstract

This thesis presents a novel framework for opinion summarization. The task consists of extracting the parts of an opinionated text that represent its main opinions and building a legible, human-readable opinion summary. The task is difficult and involves methods from opinion mining, natural language processing, machine learning, information retrieval, and other areas. The main goal of this work is to advance the state of the art in opinion mining by researching new approaches to opinion summarization and the tasks it involves. A further goal is to provide the scientific community with new tools for tackling challenges in this area; at the time of writing, there are few, if any, publicly available opinion summarization tools. The thesis explains in detail the models that describe the phenomena found while investigating this task, the methods used or developed for opinion summarization, their implementation, and the results obtained. An opinion summarization tool receives an opinionated text document as input. Its output can be the sentences or phrases of the document that best summarize the opinions found in the text. Alternatively, it can be a structured summary describing the quantitative properties of those opinions. This work uses both approaches. The proposed method for opinion summarization can be described in 3 steps:

  • First, the input document is decomposed into sentences and the opinionated sentences among them are identified, obtaining their polarity in the process.
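The first step described above (sentence decomposition, opinion detection, and polarity assignment) can be illustrated with a minimal sketch. This is not the method developed in the thesis: it assumes a naive punctuation-based sentence splitter and a tiny hypothetical word lexicon (`POSITIVE`, `NEGATIVE`), and all function names are illustrative only.

```python
import re

# Hypothetical toy lexicon, for illustration only; a real system would
# rely on a full sentiment resource or a trained classifier.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_polarity(sentence):
    """Return +1, -1, or 0 based on lexicon word counts.

    0 means no lexicon evidence (or a tie), treated here as not opinionated.
    """
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)


def opinionated_sentences(text):
    """Keep only sentences with nonzero polarity, paired with that polarity."""
    pairs = [(s, sentence_polarity(s)) for s in split_sentences(text)]
    return [(s, p) for s, p in pairs if p != 0]


review = "The camera is great. It ships in a box. The battery life is terrible!"
for sentence, polarity in opinionated_sentences(review):
    print(polarity, sentence)
```

Under these assumptions, the neutral sentence is discarded and the remaining two are tagged positive and negative. Counting the tagged sentences per polarity would give a simple form of the structured, quantitative summary mentioned in the abstract.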

Similar resources

Evaluación de las herramientas comerciales y métodos del estado del arte para la generación de resúmenes extractivos individuales

1 Universidad Autónoma del Estado de México Unidad Académica Profesional Tianguistenco Instituto Literario #100, Col. Centro, Toluca, 50000, Estado de México 2 Laboratorio de Lenguaje Natural y Procesamiento de Texto, Centro de Investigación en Computación, Instituto Politécnico Nacional, DF 07738, México sidorov@ci...

Association of rs712 polymorphism in a let-7 microRNA-binding site of KRAS gene with colorectal cancer in a Mexican population

Objective(s): The rs712 polymorphism in a let-7 microRNA-binding site at KRAS gene has been associated with cancer. To examine its association with rs712 polymorphism, we analyzed Mexican individuals with colorectal cancer (CRC) and healthy subjects. Materials and Methods: Genotyping of the rs712 polymorphism was performed by polymerase chain reaction in 281 controls and 336 CRC patients. Resul...

Clasificación de polaridad en textos con opiniones en español mediante análisis sintáctico de dependencias

This article describes an opinion mining system that classifies the polarity of Spanish texts. We propose an NLP-based approach that performs segmentation, tokenization, and POS tagging of texts and then obtains the syntactic structure of sentences by means of a dependency parser. The syntactic structure is then used to address three of the most significant linguistic constructions in the area in ...

Intramuscular Fatty Acid Composition of the Longissimus Muscle of Unweaned Minhota Breed Calves at Different Slaughter Age

Meat production from sixteen local Portuguese cattle breeds represents high economic and cultural value for local populations. Among these, Minhota, located in the northwest of the country, is one of the most important for meat aptitude. This breed is used for high-quality meat. This study describes the influence of slaughter age, corresponding veal (6 months) and beef (9 months) and sex, reared i...

MPRO: Un programa para el análisis morfológico y sintáctico en textos en español

Johann Haller, IAI – Instituto de Ciencia Aplicada de la Información, Universidad de Saarland, Martin-Luther-Strasse 14, 66111, Saarbrücken, Alemania. Alexis Donoso, IAI – Instituto de Ciencia Aplicada de la Información, Universidad de Saarland, Martin-Luther-Strasse 14, 66111, Saarbrücken, Alemania. Yamile Ramírez, IAI – Instituto de Ciencia Aplicada de la InformaciónU...

Publication date: 2014